Florida is the swing state with the most electoral votes. After going to Bush (Republican) in 2004, it went to Obama (Democrat) in 2008. Let's look deeper into the financial contribution breakdown for this election.

Description of columns can be found here

Load the data

First some preprocessing to download and parse the file

wget ftp://ftp.fec.gov/FEC/Presidential_Map/2008/P00000001/P00000001-FL.zip
unzip P00000001-FL.zip
mv P00000001-FL.csv 2008-florida-funding.csv
sed -i 's/,$//' 2008-florida-funding.csv # remove trailing comma
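
Once the file is cleaned, it could be read into a data.table along the following lines. This is a minimal sketch, not the original loading code: the two inline rows are made up so the example runs on its own, and forcing `contbr_zip` to character is an assumption based on the summary further down, where that column is reported as character.

```r
library(data.table)

# Two made-up rows standing in for 2008-florida-funding.csv (illustrative values only).
sample_csv <- 'cand_nm,contbr_nm,contbr_zip,contb_receipt_amt
"Huckabee, Mike","DOE, JANE",33813,50
"Obama, Barack","ROE, RICHARD",331354726,-25.5
'

# For the real file, pass the file name instead of text = sample_csv.
# Reading contbr_zip as character keeps 9-digit ZIP+4 values as text.
funding <- fread(text = sample_csv,
                 colClasses = list(character = "contbr_zip"))

print(dim(funding))      # number of rows and columns
print(head(funding))     # first rows
print(summary(funding))  # per-column summary
```

Keeping the zip code as text matters later, when the ZIP+4 values are shortened to five digits.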

Univariate Analysis

Next, let's load the data as a data.table (more efficient than a data frame) and get a summary.

## [1] 208331     18
##      cmte_id   cand_id        cand_nm       contbr_nm        contbr_city
## 1: C00431809 P80003478 Huckabee, Mike RAULERSON, JUDY           LAKELAND
## 2: C00431809 P80003478 Huckabee, Mike RAULERSON, JUDY           LAKELAND
## 3: C00431809 P80003478 Huckabee, Mike RAULERSON, JUDY           LAKELAND
## 4: C00431809 P80003478 Huckabee, Mike RAULERSON, JUDY           LAKELAND
## 5: C00431809 P80003478 Huckabee, Mike RAULERSON, JUDY           LAKELAND
## 6: C00431809 P80003478 Huckabee, Mike REDICK, RICHARD JACKSONVILLE BEACH
##    contbr_st contbr_zip       contbr_employer contbr_occupation
## 1:        FL      33813 CARILLON LAKES REALTY   SALES COUNSELOR
## 2:        FL      33813 CARILLON LAKES REALTY   SALES COUNSELOR
## 3:        FL      33813 CARILLON LAKES REALTY   SALES COUNSELOR
## 4:        FL      33813 CARILLON LAKES REALTY   SALES COUNSELOR
## 5:        FL      33813 CARILLON LAKES REALTY   SALES COUNSELOR
## 6:        FL      32250         SELF-EMPLOYED       REAL ESTATE
##    contb_receipt_amt contb_receipt_dt receipt_desc memo_cd memo_text
## 1:                50       2007-08-03                               
## 2:                50       2007-08-06                               
## 3:                50       2007-08-14                               
## 4:                50       2007-08-29                               
## 5:                25       2007-09-28                               
## 6:               250       2007-08-14                               
##    form_tp file_num     tran_id election_tp
## 1:   SA17A   326973 SA17A.18515       P2008
## 2:   SA17A   326973 SA17A.18663       P2008
## 3:   SA17A   326973 SA17A.21140       P2008
## 4:   SA17A   326973 SA17A.23592       P2008
## 5:   SA17A   326973 SA17A.31677       P2008
## 6:   SA17A   326973 SA17A.21224       P2008
##    cmte_id            cand_id            cand_nm         
##  Length:208331      Length:208331      Length:208331     
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
##                                                          
##   contbr_nm         contbr_city         contbr_st        
##  Length:208331      Length:208331      Length:208331     
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
##                                                          
##   contbr_zip        contbr_employer    contbr_occupation 
##  Length:208331      Length:208331      Length:208331     
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
##                                                          
##  contb_receipt_amt  contb_receipt_dt     receipt_desc      
##  Min.   :-12800.0   Min.   :2005-01-11   Length:208331     
##  1st Qu.:    30.0   1st Qu.:2008-01-31   Class :character  
##  Median :   100.0   Median :2008-05-20   Mode  :character  
##  Mean   :   284.2   Mean   :2008-04-26                     
##  3rd Qu.:   250.0   3rd Qu.:2008-09-04                     
##  Max.   : 12800.0   Max.   :2008-12-31                     
##    memo_cd           memo_text           form_tp             file_num     
##  Length:208331      Length:208331      Length:208331      Min.   :246469  
##  Class :character   Class :character   Class :character   1st Qu.:346097  
##  Mode  :character   Mode  :character   Mode  :character   Median :753761  
##                                                           Mean   :601414  
##                                                           3rd Qu.:754317  
##                                                           Max.   :877004  
##    tran_id          election_tp       
##  Length:208331      Length:208331     
##  Class :character   Class :character  
##  Mode  :character   Mode  :character  
##                                       
##                                       
## 

What is the structure of your dataset?

My dataset has ~208k rows and 18 columns. The columns are mostly strings, with only one numeric field of interest: the donation amount (contb_receipt_amt). The middle half of donations falls between $30 (first quartile) and $250 (third quartile). The middle half of donation dates falls between January 31st and September 4th of 2008.

Next, I will investigate the distribution of the different character columns.

## [1] 21
## [1] 20
## [1] 20
##                          
##                           C00420224 C00423202 C00430470 C00430512
##   Biden, Joseph R Jr              0         0         0         0
##   Brownback, Samuel Dale          0         0         0         0
##   Clinton, Hillary Rodham         0         0         0         0
##   Cox, John H                    12         0         0         0
##   Dodd, Christopher J             0         0         0         0
##   Edwards, John                   0         0         0         0
##   Gilmore, James S III            0         0         0         0
##   Giuliani, Rudolph W             0         0         0      5817
##   Gravel, Mike                    0        35         0         0
##   Huckabee, Mike                  0         0         0         0
##   Hunter, Duncan                  0         0         0         0
##   Kucinich, Dennis J              0         0         0         0
##   McCain, John S                  0         0     31939         0
##   Obama, Barack                   0         0         0         0
##   Paul, Ron                       0         0         0         0
##   Richardson, Bill                0         0         0         0
##   Romney, Mitt                    0         0         0         0
##   Tancredo, Thomas Gerald         0         0         0         0
##   Thompson, Fred Dalton           0         0         0         0
##   Thompson, Tommy G               0         0         0         0
##                          
##                           C00430694 C00430827 C00430975 C00431171
##   Biden, Joseph R Jr              0         0         0         0
##   Brownback, Samuel Dale        296         0         0         0
##   Clinton, Hillary Rodham         0         0         0         0
##   Cox, John H                     0         0         0         0
##   Dodd, Christopher J             0         0         0         0
##   Edwards, John                   0         0         0         0
##   Gilmore, James S III            0         0         0         0
##   Giuliani, Rudolph W             0         0         0         0
##   Gravel, Mike                    0         0         0         0
##   Huckabee, Mike                  0         0         0         0
##   Hunter, Duncan                  0         0         0         0
##   Kucinich, Dennis J              0         0       240         0
##   McCain, John S                  0         0         0         0
##   Obama, Barack                   0         0         0         0
##   Paul, Ron                       0         0         0         0
##   Richardson, Bill                0         0         0         0
##   Romney, Mitt                    0         0         0      5792
##   Tancredo, Thomas Gerald         0         0         0         0
##   Thompson, Fred Dalton           0         0         0         0
##   Thompson, Tommy G               0        42         0         0
##                          
##                           C00431205 C00431288 C00431379 C00431411
##   Biden, Joseph R Jr              0         0         0         0
##   Brownback, Samuel Dale          0         0         0         0
##   Clinton, Hillary Rodham         0         0         0         0
##   Cox, John H                     0         0         0         0
##   Dodd, Christopher J             0         0       330         0
##   Edwards, John                3723         0         0         0
##   Gilmore, James S III            0         2         0         0
##   Giuliani, Rudolph W             0         0         0         0
##   Gravel, Mike                    0         0         0         0
##   Huckabee, Mike                  0         0         0         0
##   Hunter, Duncan                  0         0         0        84
##   Kucinich, Dennis J              0         0         0         0
##   McCain, John S                  0         0         0         0
##   Obama, Barack                   0         0         0         0
##   Paul, Ron                       0         0         0         0
##   Richardson, Bill                0         0         0         0
##   Romney, Mitt                    0         0         0         0
##   Tancredo, Thomas Gerald         0         0         0         0
##   Thompson, Fred Dalton           0         0         0         0
##   Thompson, Tommy G               0         0         0         0
##                          
##                           C00431445 C00431569 C00431577 C00431619
##   Biden, Joseph R Jr              0         0         0         0
##   Brownback, Samuel Dale          0         0         0         0
##   Clinton, Hillary Rodham         0     33889         0         0
##   Cox, John H                     0         0         0         0
##   Dodd, Christopher J             0         0         0         0
##   Edwards, John                   0         0         0         0
##   Gilmore, James S III            0         0         0         0
##   Giuliani, Rudolph W             0         0         0         0
##   Gravel, Mike                    0         0         0         0
##   Huckabee, Mike                  0         0         0         0
##   Hunter, Duncan                  0         0         0         0
##   Kucinich, Dennis J              0         0         0         0
##   McCain, John S                  0         0         0         0
##   Obama, Barack              107907         0         0         0
##   Paul, Ron                       0         0         0         0
##   Richardson, Bill                0         0      1214         0
##   Romney, Mitt                    0         0         0         0
##   Tancredo, Thomas Gerald         0         0         0       485
##   Thompson, Fred Dalton           0         0         0         0
##   Thompson, Tommy G               0         0         0         0
##                          
##                           C00431809 C00431916 C00432914 C00438507
##   Biden, Joseph R Jr              0       606         0         0
##   Brownback, Samuel Dale          0         0         0         0
##   Clinton, Hillary Rodham         0         0         0         0
##   Cox, John H                     0         0         0         0
##   Dodd, Christopher J             0         0         0         0
##   Edwards, John                   0         0         0         0
##   Gilmore, James S III            0         0         0         0
##   Giuliani, Rudolph W             0         0         0         0
##   Gravel, Mike                    0         0         0         0
##   Huckabee, Mike               2699         0         0         0
##   Hunter, Duncan                  0         0         0         0
##   Kucinich, Dennis J              0         0         0         0
##   McCain, John S                  0         0         0         0
##   Obama, Barack                   0         0         0         0
##   Paul, Ron                       0         0      5851         0
##   Richardson, Bill                0         0         0         0
##   Romney, Mitt                    0         0         0         0
##   Tancredo, Thomas Gerald         0         0         0         0
##   Thompson, Fred Dalton           0         0         0      2266
##   Thompson, Tommy G               0         0         0         0
##                          
##                           C00446104
##   Biden, Joseph R Jr              0
##   Brownback, Samuel Dale          0
##   Clinton, Hillary Rodham         0
##   Cox, John H                     0
##   Dodd, Christopher J             0
##   Edwards, John                   0
##   Gilmore, James S III            0
##   Giuliani, Rudolph W             0
##   Gravel, Mike                    0
##   Huckabee, Mike                  0
##   Hunter, Duncan                  0
##   Kucinich, Dennis J              0
##   McCain, John S               5102
##   Obama, Barack                   0
##   Paul, Ron                       0
##   Richardson, Bill                0
##   Romney, Mitt                    0
##   Tancredo, Thomas Gerald         0
##   Thompson, Fred Dalton           0
##   Thompson, Tommy G               0

McCain has two committee IDs (C00430470 and C00446104).
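
Rather than reading the full cross-table, the same check can be done by counting distinct committee IDs per candidate. A small sketch on illustrative rows (the committee IDs are taken from the table above; the row counts are not real):

```r
library(data.table)

# Illustrative rows only.
dt <- data.table(
  cand_nm = c("McCain, John S", "McCain, John S", "Obama, Barack", "Obama, Barack"),
  cmte_id = c("C00430470", "C00446104", "C00431445", "C00431445")
)

# Count distinct committee IDs per candidate and keep those with more than one.
multi <- dt[, .(n_cmte = uniqueN(cmte_id)), by = cand_nm][n_cmte > 1]
print(multi)
```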

## 
##           Obama, Barack          McCain, John S Clinton, Hillary Rodham 
##                  107907                   37041                   33889 
##               Paul, Ron     Giuliani, Rudolph W 
##                    5851                    5817
## 
##           MIAMI           TAMPA         ORLANDO    JACKSONVILLE 
##           15414            8351            7342            6393 
##      BOCA RATON        SARASOTA          NAPLES     TALLAHASSEE 
##            6262            6098            6039            6008 
## FORT LAUDERDALE     MIAMI BEACH     GAINESVILLE    CORAL GABLES 
##            4523            4190            3812            3702
## 
##     FL 
## 208331
## 
## 331354726     33139     33133     33480 333212562     33156 320433443 
##       398       246       232       194       185       156       152 
##     33131     33134 329513361     33629     33432 
##       150       147       141       140       121
## 
## 33139 33133 33156 33480 33143 33134 32963 32312 32789 33140 33432 33176 
##  2453  2336  2154  2124  1711  1697  1465  1464  1454  1248  1208  1202

I cleaned up the zip codes by shortening the 9-digit ZIP+4 values to five digits. Otherwise, the geography data looks as expected.
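
One way to do this cleanup is to keep only the first five characters of each zip code. The sketch below assumes the zip column is stored as character and uses the column name `contbr_zip2`, which appears in a later printout; the exact cleaning code is not shown in this report.

```r
library(data.table)

# Sample values taken from the first zip code table above.
dt <- data.table(contbr_zip = c("331354726", "33139", "333212562"))

# ZIP+4 values collapse to their 5-digit base code; 5-digit values are unchanged.
dt[, contbr_zip2 := substr(contbr_zip, 1, 5)]

print(dt[, .N, by = contbr_zip2][order(-N)])
```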

## 
##                           NOT EMPLOYED 
##                                  41845 
##                          SELF EMPLOYED 
##                                  18968 
##                                RETIRED 
##                                  16844 
##                                        
##                                  12148 
##                          SELF-EMPLOYED 
##                                   6255 
##                  INFORMATION REQUESTED 
##                                   4940 
## INFORMATION REQUESTED PER BEST EFFORTS 
##                                   3865 
##                                   NONE 
##                                   1930 
##                              HOMEMAKER 
##                                   1476 
##                                   SELF 
##                                   1462 
##                  UNIVERSITY OF FLORIDA 
##                                    939 
##                    UNIVERSITY OF MIAMI 
##                                    784 
##                       STATE OF FLORIDA 
##                                    740 
##                             UNEMPLOYED 
##                                    540 
##                                    N/A 
##                                    441
## 
##                     NOT EMPLOYED                    SELF EMPLOYED 
##                            43775                            26685 
##                                                           RETIRED 
##                            21394                            16844 
##                        HOMEMAKER            UNIVERSITY OF FLORIDA 
##                             1476                              939 
##              UNIVERSITY OF MIAMI                 STATE OF FLORIDA 
##                              784                              740 
##                       UNEMPLOYED         FLORIDA STATE UNIVERSITY 
##                              540                              379 
## FLORIDA INTERNATIONAL UNIVERSITY                        REQUESTED 
##                              333                              300
## 
##                                RETIRED 
##                                  56324 
##                               ATTORNEY 
##                                  10401 
##                           NOT EMPLOYED 
##                                   6633 
##                              HOMEMAKER 
##                                   5892 
##                                        
##                                   5635 
##                  INFORMATION REQUESTED 
##                                   4454 
##                              PHYSICIAN 
##                                   4038 
## INFORMATION REQUESTED PER BEST EFFORTS 
##                                   3824 
##                             CONSULTANT 
##                                   2785 
##                                TEACHER 
##                                   2408 
##                              PROFESSOR 
##                                   2401 
##                              PRESIDENT 
##                                   2134

Occupation and employer data are quite messy, and many donors did not answer. I used a blank (“”) to indicate no response and merged the most common variants of the top values. Quite a few donors list “PRESIDENT” as their occupation. Most donors fall under “NOT EMPLOYED”, “SELF EMPLOYED”, blank, or “RETIRED”.
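
The employer counts before and after cleaning are consistent with a handful of merges: “SELF-EMPLOYED” and “SELF” into “SELF EMPLOYED”, “NONE” into “NOT EMPLOYED”, and the “INFORMATION REQUESTED” variants and “N/A” into blank. The lookup below is inferred from those counts, not taken from the original code, and `employer_map` is a hypothetical name.

```r
library(data.table)

dt <- data.table(contbr_employer = c("SELF-EMPLOYED", "SELF", "NONE", "N/A",
                                     "INFORMATION REQUESTED", "RETIRED"))

# Hypothetical lookup: variant -> canonical label ("" means no response).
employer_map <- c(
  "SELF-EMPLOYED"                          = "SELF EMPLOYED",
  "SELF"                                   = "SELF EMPLOYED",
  "NONE"                                   = "NOT EMPLOYED",
  "N/A"                                    = "",
  "INFORMATION REQUESTED"                  = "",
  "INFORMATION REQUESTED PER BEST EFFORTS" = ""
)

# Rewrite only the rows whose value appears in the lookup; leave the rest as is.
hit <- dt$contbr_employer %in% names(employer_map)
dt[hit, contbr_employer := unname(employer_map[contbr_employer])]

print(dt[, .N, by = contbr_employer][order(-N)])
```

A lookup like this only catches the variants listed in it, which is one reason the long tail of employer names stays messy.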

## 
##  SA17A   SA18  SB28A 
## 190693  14343   3295
## 
##         G2006  G2008  P2008 
##     38      1  60695 147597
##      cmte_id   cand_id             cand_nm     contbr_nm contbr_city
## 1: C00431379 P80003387 Dodd, Christopher J THOMAS, DEROY  PALM COAST
##    contbr_st contbr_zip contbr_employer contbr_occupation
## 1:        FL  321371202                           RETIRED
##    contb_receipt_amt contb_receipt_dt receipt_desc memo_cd memo_text
## 1:              2100       2006-09-18                    X          
##    form_tp file_num              tran_id election_tp contbr_zip2
## 1:    SA18   860973 A45C7EB04EC694B87839       G2006       32137

Somehow a donation for the 2006 general election snuck into this dataset, so I removed it. I am not sure why 38 election type values are blank.
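
Dropping the stray record is a one-line filter on `election_tp`. A sketch on made-up rows:

```r
library(data.table)

# Made-up rows covering each election type seen in the table above.
dt <- data.table(
  election_tp       = c("P2008", "G2008", "G2006", "", "P2008"),
  contb_receipt_amt = c(50, 100, 2100, -250, 25)
)

print(dt[, .N, by = election_tp])  # counts per election type, blanks included

# Keep everything except the 2006 general-election record.
dt <- dt[election_tp != "G2006"]
```

Rows with a blank election type are kept, matching the tables that follow.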

I added the party affiliation for each of the candidates and shortened Hillary’s name to make plotting labels easier.
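
A sketch of how the party field could be derived, assuming a hand-written list of the Democratic candidates in this file (every other candidate in the file sought the Republican nomination). The vector name `democrats` is hypothetical.

```r
library(data.table)

dt <- data.table(cand_nm = c("Obama, Barack", "McCain, John S",
                             "Clinton, Hillary Rodham", "Biden, Joseph R Jr"))

# Democratic candidates appearing in this dataset; the rest are Republicans.
democrats <- c("Biden, Joseph R Jr", "Clinton, Hillary Rodham", "Dodd, Christopher J",
               "Edwards, John", "Gravel, Mike", "Kucinich, Dennis J",
               "Obama, Barack", "Richardson, Bill")

# Assign the party before renaming, so the lookup still matches the full name.
dt[, Party := ifelse(cand_nm %in% democrats, "D", "R")]
dt[cand_nm == "Clinton, Hillary Rodham", cand_nm := "Clinton, Hillary"]

print(dt)
```

An explicit list is easy to audit, which helps avoid mislabeling a candidate.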

What is/are the main feature(s) of interest in your dataset?

The main feature of interest is donation amount (contb_receipt_amt).

What other features in the dataset do you think will help support your investigation into your feature(s) of interest?

Supporting features include candidate name (cand_nm), donation type (election_tp), donor name (contbr_nm), location (city, zip), work (employer, occupation), date of donation (contb_receipt_dt)

Did you create any new variables from existing variables in the dataset?

Yes, I added a party field and a proper zip code field.

Of the features you investigated, were there any unusual distributions? Did you perform any operations on the data to tidy, adjust, or change the form of the data? If so, why did you do this?

I cleaned up the zip codes and tried to clean up the occupation and employer fields, with little success. I converted the date field to a date object. The donation amount field is sometimes negative.

Overall contributions

Next, let's investigate overall contributions to each candidate and party.

I assume that, within the same location (city/zip/employer/occupation), each donor can be uniquely identified by name.

##                     cand_nm election_tp       Total
##  1:      Biden, Joseph R Jr       P2008   402160.80
##  2:      Biden, Joseph R Jr       G2008    43200.00
##  3:  Brownback, Samuel Dale       P2008    94257.97
##  4:        Clinton, Hillary       P2008  9312563.28
##  5:        Clinton, Hillary       G2008   303727.94
##  6:             Cox, John H       P2008     1815.00
##  7:     Dodd, Christopher J       P2008   351009.10
##  8:     Dodd, Christopher J       G2008    22300.00
##  9:           Edwards, John       P2008  1097018.71
## 10:           Edwards, John       G2008   189944.10
## 11:           Edwards, John                 -250.00
## 12:    Gilmore, James S III       P2008     1025.00
## 13:     Giuliani, Rudolph W       P2008  4734594.91
## 14:     Giuliani, Rudolph W       G2008  -696779.18
## 15:            Gravel, Mike       P2008    16595.00
## 16:          Huckabee, Mike       P2008  1181612.94
## 17:          Huckabee, Mike               -32167.16
## 18:          Hunter, Duncan       P2008    22100.00
## 19:      Kucinich, Dennis J       P2008    36977.77
## 20:      Kucinich, Dennis J                  700.00
## 21:          McCain, John S       P2008 12735326.32
## 22:          McCain, John S       G2008  2218187.95
## 23:           Obama, Barack       P2008 10919619.36
## 24:           Obama, Barack       G2008  9407917.07
## 25:           Obama, Barack                  300.00
## 26:               Paul, Ron       P2008  1231651.99
## 27:        Richardson, Bill       P2008   796248.79
## 28:        Richardson, Bill       G2008     1000.00
## 29:            Romney, Mitt       P2008  4083366.18
## 30:            Romney, Mitt       G2008  -305375.00
## 31: Tancredo, Thomas Gerald       P2008    49048.00
## 32:   Thompson, Fred Dalton       P2008   923130.16
## 33:   Thompson, Fred Dalton       G2008    -4880.00
## 34:       Thompson, Tommy G       P2008    60200.00
## 35:       Thompson, Tommy G                 -500.00
##                     cand_nm election_tp       Total

Unsurprisingly, the candidates who raised the most money were the two eventual nominees (McCain and Obama). Several candidates who raised large sums during the primary show negative general-election totals: they dropped out before the general election, so those totals reflect money going back to donors.
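
The table above is a grouped sum of `contb_receipt_amt` by candidate and election type. A sketch with made-up amounts, showing how a negative entry produces a negative group total:

```r
library(data.table)

# Made-up amounts; the negative row stands for money returned to a donor.
dt <- data.table(
  cand_nm           = c("Romney, Mitt", "Romney, Mitt", "Romney, Mitt", "Obama, Barack"),
  election_tp       = c("P2008", "P2008", "G2008", "G2008"),
  contb_receipt_amt = c(2300, 1000, -2300, 100)
)

# Sum of contributions for every candidate / election type pair.
totals <- dt[, .(Total = sum(contb_receipt_amt)), by = .(cand_nm, election_tp)]
print(totals)
```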

##                     cand_nm Party       Total
##  1:          McCain, John S     R 12735326.32
##  2:           Obama, Barack     D 10919619.36
##  3:        Clinton, Hillary     D  9312563.28
##  4:     Giuliani, Rudolph W     R  4734594.91
##  5:            Romney, Mitt     R  4083366.18
##  6:               Paul, Ron     R  1231651.99
##  7:          Huckabee, Mike     R  1181612.94
##  8:           Edwards, John     D  1097018.71
##  9:   Thompson, Fred Dalton     R   923130.16
## 10:        Richardson, Bill     D   796248.79
## 11:      Biden, Joseph R Jr     D   402160.80
## 12:     Dodd, Christopher J     D   351009.10
## 13:  Brownback, Samuel Dale     R    94257.97
## 14:       Thompson, Tommy G     R    60200.00
## 15: Tancredo, Thomas Gerald     R    49048.00
## 16:      Kucinich, Dennis J     D    36977.77
## 17:          Hunter, Duncan     R    22100.00
## 18:            Gravel, Mike     D    16595.00
## 19:             Cox, John H     R     1815.00
## 20:    Gilmore, James S III     R     1025.00

During the primary, McCain raised far more from Florida than any other Republican candidate: about $12.7M, versus about $4.7M for Giuliani, the next highest.

Focusing on the top fundraisers from the primary, you can see that contributions for the runners-up dried up or went negative (donors withdrawing money) when they did not win the primary. Contributions for the two candidates that won the primaries (Obama and McCain) kept increasing.

Most donations are small, and some are negative.

##    Party    Total
## 1:     R 26296615
## 2:     D 32901032

Overall, Democrats raised more money from Florida than Republicans.

Largest contributions

Next, let's have a look at which groups donated the most.

##                            contbr_nm          cand_nm        contbr_city
##  1:                CASTRO, JUANITA C Clinton, Hillary       CORAL GABLES
##  2:         SCRIBANTE, LYNDA H. MRS.   McCain, John S            SANIBEL
##  3:        BOLANOS, ZOILA GLORIA MS.   McCain, John S             WESTON
##  4:            SMITH, REBECCA J. MS.   McCain, John S              TAMPA
##  5:                 BALZOLA, GABRIEL    Obama, Barack              MIAMI
##  6:               SENESE, VICTOR MR.   McCain, John S PALM BEACH GARDENS
##  7:              SHALLER, NELSON MR.   McCain, John S         BOCA RATON
##  8:            PRESCOTT, EDNA B. MS.   McCain, John S             STUART
##  9:            O'DONNELL, MERRY T MS    Obama, Barack         JUNO BEACH
## 10:                       KNAPP, AMY    Obama, Barack      COCONUT GROVE
## 11:                     NANCE, JONAS Clinton, Hillary INDIAN HARBOUR BEA
## 12:                    PEDRAZA, RAUL    Obama, Barack              MIAMI
## 13:          KRONGOLD, M. RONALD MR.   McCain, John S              MIAMI
## 14:                     GEO GOUP PAC   McCain, John S         BOCA RATON
## 15: OSI RESTAURANT PARTNERS INC. PAC   McCain, John S              TAMPA
## 16:                WESTREICH, HELENE Clinton, Hillary        BAL HARBOUR
## 17:                  BURTON, DEMETRA Clinton, Hillary           AVENTURA
## 18:                   WEAVER, DAPHNE Clinton, Hillary          MANALAPAN
## 19:                       BEAN, JOHN    Obama, Barack           SARASOTA
## 20:            SEMBLER, BRENT W. MR.   McCain, John S   SAINT PETERSBURG
##                 contbr_employer                  contbr_occupation
##  1:                NOT EMPLOYED                            RETIRED
##  2:                   HOMEMAKER                          HOMEMAKER
##  3:       G. & C. DE JESUS INC.                          PRESIDENT
##  4: THE A.D. MORGAN CORPORATION                 GENERAL CONTRACTOR
##  5:                  GB CAPITAL REAL ESTATE DEVELOPMENT/INVESTMENT
##  6:                     RETIRED                            RETIRED
##  7:            IHS DIALYSIS INC                                CEO
##  8:                   HOMEMAKER                          HOMEMAKER
##  9:                NOT EMPLOYED                            RETIRED
## 10:         UNITED HEALTH GROUP                          EXECUTIVE
## 11:           WEICHERT REALTORS                           SALESMAN
## 12:         MAGNO INTERNATIONAL      PRESIDENT CORPORATE EXECUTIVE
## 13:               SELF EMPLOYED                        REAL ESTATE
## 14:                                                               
## 15:                                                               
## 16:                NOT EMPLOYED                       NOT EMPLOYED
## 17:      BURTON ASSOC OF SO FLA               GOVERNMENT RELATIONS
## 18:                NOT EMPLOYED                          HOMEMAKER
## 19:                NOT EMPLOYED                            RETIRED
## 20:         THE SEMBLER COMPANY              REAL ESTATE EXECUTIVE
##        Total
##  1: 13950.00
##  2: 12800.00
##  3: 12300.00
##  4: 12300.00
##  5: 11500.00
##  6: 11500.00
##  7: 11500.00
##  8: 10900.00
##  9: 10850.00
## 10: 10645.18
## 11: 10200.00
## 12: 10142.50
## 13: 10000.00
## 14: 10000.00
## 15: 10000.00
## 16:  9900.00
## 17:  9200.00
## 18:  9200.00
## 19:  9200.00
## 20:  9200.00

Half of the 20 largest donors gave to McCain. The occupations and employers do not show an obvious pattern.
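
A ranking like this follows from the assumption that a donor is identified by name within one location: group on the identifying columns plus the candidate, sum, and sort. The sketch below uses made-up donors and fewer grouping columns than the table above, to stay short.

```r
library(data.table)

# Made-up donors; the same name in two cities counts as two different donors.
dt <- data.table(
  contbr_nm         = c("DOE, JANE", "DOE, JANE", "DOE, JANE", "ROE, RICHARD"),
  contbr_city       = c("TAMPA", "TAMPA", "MIAMI", "MIAMI"),
  cand_nm           = c("McCain, John S", "McCain, John S", "Obama, Barack", "Obama, Barack"),
  contb_receipt_amt = c(2300, 2300, 500, 1000)
)

# Total per donor identity and candidate, largest first.
top <- dt[, .(Total = sum(contb_receipt_amt)),
          by = .(contbr_nm, cand_nm, contbr_city)][order(-Total)]

print(head(top, 20))
```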

##         contbr_city          cand_nm     Total
##  1:           MIAMI    Obama, Barack 2032341.4
##  2:           MIAMI Clinton, Hillary 1346698.8
##  3:          NAPLES   McCain, John S  925217.0
##  4:           TAMPA    Obama, Barack  891495.0
##  5:           MIAMI   McCain, John S  868417.3
##  6:     MIAMI BEACH    Obama, Barack  846736.6
##  7:    JACKSONVILLE    Obama, Barack  824205.9
##  8:    CORAL GABLES    Obama, Barack  768285.6
##  9:     TALLAHASSEE    Obama, Barack  762976.1
## 10:        SARASOTA    Obama, Barack  711810.0
## 11:    JACKSONVILLE   McCain, John S  705171.8
## 12:         ORLANDO    Obama, Barack  675680.5
## 13:           TAMPA   McCain, John S  668084.5
## 14:      BOCA RATON    Obama, Barack  663521.8
## 15:      BOCA RATON   McCain, John S  584809.5
## 16: FORT LAUDERDALE    Obama, Barack  583411.0
## 17:          NAPLES    Obama, Barack  552059.6
## 18:     MIAMI BEACH Clinton, Hillary  514105.7
## 19:      VERO BEACH   McCain, John S  511740.0
## 20:      PALM BEACH   McCain, John S  489871.0

Miami is a large city and accounts for a lot of donations to many candidates. Naples has one of the highest per-capita incomes in the US.

##    region    value
## 1:        17151.99
## 2:  00000 13726.89
## 3:  00001   248.00
## 4:  00003   -40.00
## 5:  03313  1000.00
## 6:  03316  2050.00
## [1] 433   2

I subtracted Republican donations in every zip code from Democratic donations. Negative (red) zip codes have more Republican donation money, while positive (blue) zip codes have more Democratic donation money.

The area around Tallahassee donated more to Democrats, while the area around Jacksonville donated more to Republicans. The areas around Tampa and Miami are pretty mixed.

Lastly, there are 433 zip codes that do not belong to Florida. Next I’ll plot cities instead of zip codes.
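
The per-zip difference can be computed in one grouped sum by flipping the sign of Republican amounts. A sketch on made-up rows, assuming a `Party` column coded "D"/"R" and the 5-digit zip in `contbr_zip2`; the output column names `region` and `value` follow the printout above.

```r
library(data.table)

# Made-up amounts for two zip codes.
dt <- data.table(
  contbr_zip2       = c("32301", "32301", "32202", "32202"),
  Party             = c("D", "R", "D", "R"),
  contb_receipt_amt = c(500, 200, 100, 400)
)

# Democratic money counts as positive, Republican money as negative.
zip_diff <- dt[, .(value = sum(ifelse(Party == "D",
                                      contb_receipt_amt,
                                      -contb_receipt_amt))),
               by = .(region = contbr_zip2)]
print(zip_diff)
```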
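
A rough way to flag zip codes that cannot be in Florida is a prefix check, since Florida zip codes start with 32, 33, or 34. This is only an approximation and may not be how the 433 figure was obtained; the sample values come from the printout above plus a few valid Florida codes.

```r
# Sample regions: blank, malformed, and valid Florida zip codes.
region <- c("", "00000", "03313", "33139", "32301", "34102")

# Five digits, starting with 32, 33, or 34.
is_fl <- grepl("^3[2-4][0-9]{3}$", region)

print(region[!is_fl])
```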

The candidates raised a lot of money in the area around Miami. Gainesville made the top 20 only for Obama and Ron Paul.

During the general election, the top donating cities for the two candidates mostly overlapped. However, most cities in Florida donated more to Obama than to McCain.

##     contbr_employer               cand_nm     Total
##  1:    NOT EMPLOYED         Obama, Barack 5231425.9
##  2:         RETIRED        McCain, John S 4359725.0
##  3:   SELF EMPLOYED         Obama, Barack 3563870.5
##  4:    NOT EMPLOYED      Clinton, Hillary 2781359.0
##  5:   SELF EMPLOYED        McCain, John S 2028022.5
##  6:   SELF EMPLOYED      Clinton, Hillary 1613131.3
##  7:                          Romney, Mitt 1306232.2
##  8:                        McCain, John S 1216656.0
##  9:       HOMEMAKER        McCain, John S 1117562.5
## 10:                   Giuliani, Rudolph W  909353.3
## 11:                         Obama, Barack  860264.1
## 12:   SELF EMPLOYED   Giuliani, Rudolph W  635213.3
## 13:   SELF EMPLOYED          Romney, Mitt  430502.0
## 14:                 Thompson, Fred Dalton  323546.9
## 15:                      Clinton, Hillary  259098.2
## 16:                        Huckabee, Mike  251449.5
## 17:    NOT EMPLOYED             Paul, Ron  243911.7
## 18:   SELF EMPLOYED         Edwards, John  234299.2
## 19:    NOT EMPLOYED         Edwards, John  230850.0
## 20:       REQUESTED        Huckabee, Mike  202852.7

Grouping by employer is not very useful.

##     contbr_occupation             cand_nm     Total
##  1:           RETIRED      McCain, John S 4728666.0
##  2:           RETIRED       Obama, Barack 4007001.2
##  3:          ATTORNEY       Obama, Barack 2462331.8
##  4:           RETIRED    Clinton, Hillary 1593216.0
##  5:         HOMEMAKER      McCain, John S 1213036.5
##  6:          ATTORNEY    Clinton, Hillary  958995.3
##  7:                         Obama, Barack  811904.1
##  8:                        McCain, John S  787892.0
##  9:      NOT EMPLOYED       Obama, Barack  729532.0
## 10:           RETIRED        Romney, Mitt  726242.2
## 11:           RETIRED Giuliani, Rudolph W  723925.0
## 12:          ATTORNEY      McCain, John S  719383.7
## 13:         HOMEMAKER    Clinton, Hillary  655612.1
## 14:         HOMEMAKER       Obama, Barack  642493.1
## 15:         PHYSICIAN       Obama, Barack  613105.9
## 16:          ATTORNEY       Edwards, John  568175.9
## 17:         HOMEMAKER Giuliani, Rudolph W  460908.3
## 18:      NOT EMPLOYED    Clinton, Hillary  447059.0
## 19:         HOMEMAKER        Romney, Mitt  418004.0
## 20:         PHYSICIAN      McCain, John S  405667.0

Totals grouped by occupation are also not very useful: RETIRED, HOMEMAKER, ATTORNEY and blank entries account for most of the top rows.
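The two tables above are grouped totals sorted in descending order. A minimal Python sketch of that aggregation follows. It is not the code used to produce the tables; the sample rows and the `top_totals` helper are made up for illustration, and only the column names come from the dataset.

```python
from collections import defaultdict

# Hypothetical sample rows: (contbr_occupation, cand_nm, contb_receipt_amt)
rows = [
    ("RETIRED", "McCain, John S", 500.0),
    ("RETIRED", "McCain, John S", 250.0),
    ("ATTORNEY", "Obama, Barack", 1000.0),
    ("RETIRED", "Obama, Barack", 100.0),
    ("", "Obama, Barack", 50.0),
]

def top_totals(rows, n=20):
    """Sum amounts per (occupation, candidate) and return the n largest groups."""
    totals = defaultdict(float)
    for occupation, candidate, amount in rows:
        totals[(occupation, candidate)] += amount
    # Largest totals first, as in the tables above
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]

top = top_totals(rows)
for (occupation, candidate), total in top:
    print(f"{occupation or '(blank)':>10}  {candidate:<16} {total:>8.2f}")
```

Blank occupations form their own group here, which mirrors the empty rows in the tables.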

##                  contbr_occupation               cand_nm   Total N
##  1:            RESTAURANT BUSINESS        McCain, John S 4600.00 1
##  2:                          PRES.        McCain, John S 4600.00 1
##  3:                     EXECTUTIVE        McCain, John S 4600.00 1
##  4: REAL ESTATE & ASSET MANAGEMENT          Romney, Mitt 3450.00 2
##  5:             CHAIRMAN/PRESIDENT        McCain, John S 3450.00 2
##  6:                       CHAIRMAN        Huckabee, Mike 2875.00 4
##  7:              AUTOMOBILE DEALER        McCain, John S 2425.25 4
##  8:                      INSURANCE Thompson, Fred Dalton 2300.00 3
##  9:       ADMINISTRATIVE ASSISTANT        Huckabee, Mike 2300.00 2
## 10:        REGIONAL VICE PRESIDENT        Huckabee, Mike 2300.00 1
## 11:        CHIEF FINANCIAL ADVISOR      Richardson, Bill 2300.00 1
## 12:              PRESIDENT AND COO      Richardson, Bill 2300.00 1
## 13:                AUDIO/RECORDING Thompson, Fred Dalton 2300.00 1
## 14:             CPA/WEALTH MANAGER      Clinton, Hillary 2300.00 2
## 15:  GOVERNMENT AFFAIRS CONSULTANT      Clinton, Hillary 2300.00 2
## 16:                 MOVIE PRODUCER      Clinton, Hillary 2300.00 2
## 17:                          BAKER      Clinton, Hillary 2300.00 1
## 18:               MANAGING PARTNER      Richardson, Bill 2300.00 1
## 19:       ADMINISTRASION EXECUTIVE      Clinton, Hillary 2300.00 2
## 20:                         AUTHOR      Richardson, Bill 2300.00 1

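Sorted by average donation per occupation, most of the top occupations have only 1 or 2 entries (N). Several are also misspelled or abbreviated free-text values (EXECTUTIVE, PRES., ADMINISTRASION EXECUTIVE), which means that the occupation column needs further cleanup.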
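One way to keep single-entry groups from dominating a ranking by average is to require a minimum group size. The Python sketch below illustrates the idea on made-up rows. The `mean_by_group` helper, the `min_n` threshold and the whitespace/case normalization are assumptions for illustration, not steps taken in this analysis.

```python
from collections import defaultdict

# Hypothetical sample rows: (contbr_occupation, cand_nm, contb_receipt_amt)
rows = [
    ("PRES.", "McCain, John S", 4600.0),
    ("RETIRED", "McCain, John S", 100.0),
    ("retired ", "McCain, John S", 300.0),
    ("RETIRED", "McCain, John S", 200.0),
]

def mean_by_group(rows, min_n=1):
    """Average amount per (occupation, candidate), keeping groups with >= min_n rows."""
    groups = defaultdict(list)
    for occupation, candidate, amount in rows:
        # Light normalization (assumption): trim whitespace and upper-case
        groups[(occupation.strip().upper(), candidate)].append(amount)
    result = [
        (key, sum(values) / len(values), len(values))
        for key, values in groups.items()
        if len(values) >= min_n
    ]
    return sorted(result, key=lambda r: r[1], reverse=True)

unfiltered = mean_by_group(rows)          # single-entry groups rank first
filtered = mean_by_group(rows, min_n=3)   # only groups with at least 3 rows
```

With no threshold, the single 4600 entry ranks first. With `min_n=3`, only the RETIRED group remains.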

Boxplots show many outliers. McCain has quite a few donors at the 5k+ level. Most donations are under 3k; I will investigate this further below.

The violin plot shows an interesting difference in distributions. Donations under 3k are often trimodal: a large number of donations around 2.3k, a large number of donations around 1k, and a wider group of donors with a long tail around 100.

Romney and Giuliani both have a larger proportion of donors over 1.5k. Ron Paul and Obama both have more donors who donated less money. Obama in particular has three large groups of donors who donated under 600.

Average Donation Size

On average, donation sizes for Giuliani and Romney were larger.

When we aggregate all donations from the same person (assuming one person per name per zip), the average donation increases and the difference between candidates decreases. This implies that some donors donate very little but multiple times.

A large number of individuals donated over 10 times to Obama, while only 1 person did so for Giuliani. I made the axes the same to make it easier to compare the candidates; however, limiting the axis to 75 donations truncates Obama’s dataset.
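To illustrate why the average rises after aggregation, here is a small Python sketch on made-up data. It is not the code behind the plots; the donor names and amounts are hypothetical, and the donor key of (name, zip) follows the assumption stated above.

```python
from collections import defaultdict

# Hypothetical rows: (contbr_nm, contbr_zip, contb_receipt_amt)
donations = [
    ("DOE, JANE", "33101", 25.0),
    ("DOE, JANE", "33101", 25.0),
    ("DOE, JANE", "33101", 25.0),
    ("DOE, JANE", "33101", 25.0),
    ("ROE, RICHARD", "34102", 1000.0),
]

# Average over individual donations
mean_per_donation = sum(amount for _, _, amount in donations) / len(donations)

# Average over donors, treating each (name, zip) pair as one person
per_donor = defaultdict(float)
for name, zip_code, amount in donations:
    per_donor[(name, zip_code)] += amount
mean_per_donor = sum(per_donor.values()) / len(per_donor)

print(mean_per_donation)  # 220.0
print(mean_per_donor)     # 550.0
```

The repeat donor's four small gifts pull the per-donation average down, but count only once in the per-donor average.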

##          cand_nm         contbr_nm contbr_city contbr_zip contbr_employer
## 1: Obama, Barack  NOGUERA, CARLOTA       MIAMI  331354726    NOT EMPLOYED
## 2: Obama, Barack GIARDINA, REBECCA     TAMARAC  333212562    NOT EMPLOYED
## 3: Obama, Barack  NOGUERA, CARLOTA       MIAMI  331354726    NOT EMPLOYED
##    contbr_occupation election_tp Party Total_donation N_donations
## 1:         HOMEMAKER       P2008     D        2300.00         267
## 2:           RETIRED       P2008     D        2303.13         132
## 3:         HOMEMAKER       G2008     D        2200.00         116

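Surprisingly, 3 donor records show over 100 donations to Obama’s campaign! Note that one donor appears twice, once for the primary (P2008) and once for the general election (G2008), so these rows represent 2 distinct people.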

Final Plots and Summary

Plot One

Description One

Plotting cumulative donations within 2 years of the election date shows that totals can be relatively close during the primaries, but the eventual nominees raise the most money in the end.

At the beginning of January 2008, the funding totals were:

##                cand_nm   Total
## 1:    Clinton, Hillary 6101517
## 2: Giuliani, Rudolph W 5012007
## 3:      McCain, John S 2221205
## 4:       Obama, Barack 3554218
## 5:           Paul, Ron 1006869
## 6:        Romney, Mitt 3610918

Surprisingly, the candidates who won their primaries did not have the most money raised at this point. Additional donations often stop (plateaus in the chart) when a candidate drops out (dates below). The drops in cumulative donations correspond to donors requesting refunds.

  • Hillary Clinton withdrew on June 7, 2008
  • Rudy Giuliani withdrew on January 30, 2008
  • John McCain won the Republican primary
  • Barack Obama won the Democratic primary
  • Ron Paul withdrew on June 12, 2008
  • Mitt Romney withdrew on February 7, 2008
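
The plateaus and drops described above follow directly from how a running total behaves. The Python sketch below uses made-up dates and amounts, not values from the dataset, to show that a negative receipt (a refund) lowers the cumulative total and that the total stays flat when no new donations arrive.

```python
from datetime import date
from itertools import accumulate

# Hypothetical receipts for one candidate: (contb_receipt_dt, contb_receipt_amt)
receipts = [
    (date(2007, 11, 1), 500.0),
    (date(2007, 12, 15), 2300.0),
    (date(2008, 2, 10), -2300.0),  # refund, recorded as a negative amount
    (date(2008, 1, 5), 100.0),
]

# Sort by date, then take the running sum of the amounts
ordered = sorted(receipts)
running = list(accumulate(amount for _, amount in ordered))

for (day, amount), total in zip(ordered, running):
    print(day, f"{amount:>9.2f}", f"{total:>9.2f}")
```

After the last receipt the running total stays where it is, which appears as a plateau in a cumulative plot.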
##    Party    Total
## 1:     R 14116496
## 2:     D 12284194

By the end of the election, Democrats had raised more money than Republicans. However, at the beginning of January 2008, even though Hillary Clinton (a Democrat) was leading among individual candidates (see numbers above), Republicans had raised more money overall. The money raised by Republicans was spread over more candidates.
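As a quick check of the party comparison, the Python sketch below sums the January 2008 totals of the six candidates listed above by party. The party labels are filled in by hand. The results are lower than the party table (14116496 and 12284194), presumably because that table covers more candidates than these six; that reading is an assumption.

```python
# January 2008 totals for the six candidates, copied from the table above
totals = {
    "Clinton, Hillary": 6101517,
    "Giuliani, Rudolph W": 5012007,
    "McCain, John S": 2221205,
    "Obama, Barack": 3554218,
    "Paul, Ron": 1006869,
    "Romney, Mitt": 3610918,
}

# Party of each candidate (D = Democrat, R = Republican)
party = {
    "Clinton, Hillary": "D",
    "Giuliani, Rudolph W": "R",
    "McCain, John S": "R",
    "Obama, Barack": "D",
    "Paul, Ron": "R",
    "Romney, Mitt": "R",
}

by_party = {}
for candidate, amount in totals.items():
    by_party[party[candidate]] = by_party.get(party[candidate], 0) + amount

print(by_party)  # {'D': 9655735, 'R': 11850999}
```

Even restricted to these six candidates, the Republican total is larger while the single largest amount belongs to a Democrat.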

Plot Two

Description Two

This violin plot shows a lot of information about the donation size distribution for each candidate. I sum all the donations from an individual (assuming only one donor with the same name, location and occupation), remove negative values and plot the total amount from that donor. From this plot, one could make inferences about the net worth of the donors (based on how much they are able to donate).

Removing the restriction of total donations under 3000 magnifies the differences between the candidates’ donor bases. Obama’s and Ron Paul’s donors often donate smaller amounts. Romney’s and Giuliani’s donors, on the other hand, donate much more.

##                cand_nm Fraction_under_100
## 1:           Paul, Ron        0.114002478
## 2:       Obama, Barack        0.172848265
## 3:      McCain, John S        0.128640904
## 4:    Clinton, Hillary        0.140245548
## 5:        Romney, Mitt        0.025900566
## 6: Giuliani, Rudolph W        0.005715841
##                cand_nm Fraction_over_1000
## 1:           Paul, Ron          0.1280463
## 2:       Obama, Barack          0.1297157
## 3:      McCain, John S          0.2609946
## 4:    Clinton, Hillary          0.2236913
## 5:        Romney, Mitt          0.4614469
## 6: Giuliani, Rudolph W          0.4730539

The summarized data above quantifies the difference in donor base. Almost half of Romney and Giuliani’s donors donate over $1000 while under 3% donate under $100. In contrast, 17% of Obama’s donors donate under $100.
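The fractions above are shares of donors whose total falls below or above a threshold. A minimal Python sketch of that calculation on made-up per-donor totals follows. The `fraction` helper and the sample values are hypothetical, and whether the original thresholds were strict or inclusive is not stated, so strict comparisons are an assumption.

```python
def fraction(values, predicate):
    """Share of values for which predicate is true."""
    return sum(1 for v in values if predicate(v)) / len(values)

# Hypothetical per-donor totals for one candidate
donor_totals = [25.0, 50.0, 250.0, 1500.0, 2300.0, -100.0]

# Drop non-positive totals, as described for the violin plot
positive = [v for v in donor_totals if v > 0]

under_100 = fraction(positive, lambda v: v < 100)
over_1000 = fraction(positive, lambda v: v > 1000)

print(under_100, over_1000)  # 0.4 0.4
```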

Plot Three

Description Three

I chose this plot because it shows the geographic location of the cities that supported each candidate. Most cities supported both candidates, with more money going to Obama than McCain. Cities around Miami are the source of most donations. The donations plotted are only for the general election.

##            cand_nm     Total        city_lookup
##  1:  Obama, Barack 924500.53           MIAMI,FL
##  2: McCain, John S 136557.00          NAPLES,FL
##  3:  Obama, Barack 401525.44     MIAMI BEACH,FL
##  4: McCain, John S 134075.00           MIAMI,FL
##  5:  Obama, Barack 395410.76           TAMPA,FL
##  6: McCain, John S 130270.00           TAMPA,FL
##  7:  Obama, Barack 377241.91    CORAL GABLES,FL
##  8: McCain, John S 106302.00      PALM BEACH,FL
##  9:  Obama, Barack 324421.72      BOCA RATON,FL
## 10: McCain, John S  76440.50      BOCA RATON,FL
## 11:  Obama, Barack 316618.34     TALLAHASSEE,FL
## 12: McCain, John S  74431.85    JACKSONVILLE,FL
## 13:  Obama, Barack 313374.15 FORT LAUDERDALE,FL
## 14: McCain, John S  73471.00      VERO BEACH,FL
## 15:  Obama, Barack 303758.70        SARASOTA,FL
## 16: McCain, John S  70566.00         ORLANDO,FL
## 17:  Obama, Barack 296083.74    JACKSONVILLE,FL
## 18: McCain, John S  61648.57 FORT LAUDERDALE,FL
## 19:  Obama, Barack 295288.48          NAPLES,FL
## 20: McCain, John S  41551.00     WINTER PARK,FL

This list of the top 10 cities that raised money for Obama and McCain shows that while many of the cities overlap, the amounts raised differ dramatically.

##           cand_nm   Total
## 1:  Obama, Barack 9407917
## 2: McCain, John S 2218188
##            cand_nm      Pct        city_lookup
##  1:  Obama, Barack 9.826835           MIAMI,FL
##  2: McCain, John S 6.156241          NAPLES,FL
##  3:  Obama, Barack 4.267953     MIAMI BEACH,FL
##  4: McCain, John S 6.044348           MIAMI,FL
##  5:  Obama, Barack 4.202958           TAMPA,FL
##  6: McCain, John S 5.872812           TAMPA,FL
##  7:  Obama, Barack 4.009835    CORAL GABLES,FL
##  8: McCain, John S 4.792290      PALM BEACH,FL
##  9:  Obama, Barack 3.448391      BOCA RATON,FL
## 10: McCain, John S 3.446079      BOCA RATON,FL
## 11:  Obama, Barack 3.365446     TALLAHASSEE,FL
## 12: McCain, John S 3.355525    JACKSONVILLE,FL
## 13:  Obama, Barack 3.330962 FORT LAUDERDALE,FL
## 14: McCain, John S 3.312208      VERO BEACH,FL
## 15:  Obama, Barack 3.228756        SARASOTA,FL
## 16: McCain, John S 3.181245         ORLANDO,FL
## 17:  Obama, Barack 3.147176    JACKSONVILLE,FL
## 18: McCain, John S 2.779231 FORT LAUDERDALE,FL
## 19:  Obama, Barack 3.138723          NAPLES,FL
## 20: McCain, John S 1.873196     WINTER PARK,FL

Since the amounts raised by the two candidates differed so much, I also looked into the percent of donations that came from each city. The above table shows that Miami contributed quite a large share of Obama’s total donations (about 10%), while McCain’s largest city (Naples, FL, one of the wealthiest cities in the US) contributed only about 6% of McCain’s total. About 42% of Obama’s general election donations came from his top 10 cities, compared with 41% for McCain.
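The percentages can be reproduced from the two tables above: each city's share is its total divided by the candidate's overall general election total. The Python sketch below does this arithmetic with the printed values; it is a check on the numbers, not the code used in the analysis.

```python
# General election totals per candidate, from the table above
candidate_total = {"Obama, Barack": 9407917, "McCain, John S": 2218188}

# Top 10 city totals per candidate, copied from the city table above
top10 = {
    "Obama, Barack": [924500.53, 401525.44, 395410.76, 377241.91, 324421.72,
                      316618.34, 313374.15, 303758.70, 296083.74, 295288.48],
    "McCain, John S": [136557.00, 134075.00, 130270.00, 106302.00, 76440.50,
                       74431.85, 73471.00, 70566.00, 61648.57, 41551.00],
}

def pct(amount, total):
    """Percent of total represented by amount."""
    return 100.0 * amount / total

# Share of the largest city and of the top 10 cities combined
largest_share = {c: pct(max(v), candidate_total[c]) for c, v in top10.items()}
top10_share = {c: pct(sum(v), candidate_total[c]) for c, v in top10.items()}

for cand in candidate_total:
    print(cand, round(largest_share[cand], 1), round(top10_share[cand], 1))
```

The concentration in the top 10 cities is nearly identical for the two candidates even though the dollar amounts differ by roughly a factor of four.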

Reflection

Some of my early struggles when analyzing this dataset had to do with cleaning up the data and understanding why donations could be negative. I also spent a while trying to figure out the right comparisons. Plotting all presidential candidates would make the plots look too complicated, so I focused on the top 6 candidates (2 Democrats, 4 Republicans).

By focusing on the top candidates, the plots were easier to understand. I also spent a while figuring out the libraries to visualize geographically where the zip code boundaries and cities were. When I got them working, I felt the result was much more meaningful than just listing zip codes without context.

Some future work that could be done with this dataset is to compare donor zip codes to census estimates of population, income and racial breakdown for each zip code. One could also clean up the occupation column and use a controlled vocabulary instead of free text. Another interesting dataset to overlay would be how many times a candidate visited a city, compared to the rate of donations coming in from that area along with the eventual vote results from the election.